Abstract

Weighted histograms are often used in Monte-Carlo simulations to estimate
a probability density function. A weighted histogram is obtained as the result
of a random experiment with random events that have weights. In this paper the
bin contents of a weighted histogram are considered as a sum of random
variables with a random number of terms. Goodness of fit tests for weighted
histograms and for weighted histograms with unknown normalization are
proposed. Sizes and powers of the tests are investigated numerically.

1 Introduction



A histogram with $m$ bins for a given probability density function $p(x)$ is
used to estimate the probabilities

$$p_i = \int_{S_i} p(x)\,dx, \quad i = 1, \ldots, m \tag{1}$$

that a random event belongs to bin $i$. Integration in (1) is done over the bin
$S_i$.

A histogram can be obtained as a result of a random experiment with
probability density function $p(x)$. Let us denote the number of random events
belonging to the $i$th bin of the histogram as $n_i$. The total number of events
in the histogram is equal to $n = \sum_{i=1}^{m} n_i$. The quantity
$\hat{p}_i = n_i/n$ is an estimator of $p_i$ with expectation value
$E\hat{p}_i = p_i$. The distribution of the number of events for bins of the
histogram is the multinomial distribution [1] and the probability of the
random vector $(n_1, \ldots, n_m)$ is given by

$$P(n_1, \ldots, n_m) = \frac{n!}{n_1!\, n_2! \cdots n_m!}\; p_1^{n_1} \cdots p_m^{n_m}, \quad \sum_{i=1}^{m} p_i = 1. \tag{2}$$

The problem of goodness of fit is to test the hypothesis

$$H_0: p_1 = p_{10}, \ldots, p_{m-1} = p_{m-1,0} \quad \text{vs.} \quad H_a: p_i \neq p_{i0} \ \text{for some } i, \tag{3}$$

where $p_{i0}$ are specified probabilities, and $\sum_{i=1}^{m} p_{i0} = 1$. The
test is used in data analysis for comparison of the theoretical frequencies
$np_{i0}$ with the observed frequencies $n_i$. This classical problem remains of
current practical interest. The test statistic

$$X^2 = \sum_{i=1}^{m} \frac{(n_i - np_{i0})^2}{np_{i0}} \tag{4}$$

was suggested by Pearson [2]. Pearson showed that the statistic (4) has
approximately a $\chi^2_{m-1}$ distribution if the hypothesis $H_0$ is true.
Improvements of the chi-square test were proposed in [3-5]; the likelihood
ratio test [6] and an exact test [7] are also known. A review and comparison of
different multinomial goodness-of-fit tests was done in [8], and a detailed
numerical investigation has been given in [9].
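
As a small illustration of statistic (4), the following Python sketch computes
Pearson's statistic from a list of bin counts and hypothesized probabilities.
The function name `pearson_chi2` and the example numbers are hypothetical and
are not taken from the paper.

```python
def pearson_chi2(counts, p0):
    """Pearson's statistic (4): sum over bins of (n_i - n*p_i0)^2 / (n*p_i0)."""
    n = sum(counts)  # total number of events in the histogram
    return sum((ni - n * pi) ** 2 / (n * pi) for ni, pi in zip(counts, p0))


# Hypothetical example: 20 events observed in two equally probable bins.
x2 = pearson_chi2([12, 8], [0.5, 0.5])
print(x2)
```

Under $H_0$ the returned value would be compared with a quantile of the
$\chi^2_{m-1}$ distribution.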

Weighted histograms are often obtained as a result of Monte-Carlo simulations.
References [10-12] are examples of research works in high energy physics,
statistical mechanics and astrophysics using such histograms. Operations with
weighted histograms are implemented in contemporary systems for data analysis:
HBOOK [13], Physics Analysis Workstation (PAW) [14] and the ROOT framework
[15], developed at CERN (European Organization for Nuclear Research, Geneva,
Switzerland).

To define a weighted histogram let us write the probability $p_i$ (1) for a
given probability density function $p(x)$ in the form

$$p_i = \int_{S_i} p(x)\,dx = \int_{S_i} w(x)\,g(x)\,dx, \tag{5}$$

where

$$w(x) = p(x)/g(x) \tag{6}$$

is the weight function and $g(x)$ is some other probability density function.
The function $g(x)$ must be larger than 0 for points $x$ where $p(x) \neq 0$.
The weight $w(x) = 0$ if $p(x) = 0$ [16].

The weighted histogram is obtained as a result of a random experiment with
probability density function $g(x)$ and weights of events calculated according
to (6). Let us denote the total sum of weights of events in the $i$th bin of the
weighted histogram as

$$W_i = \sum_{k=1}^{n_i} w_i(k), \tag{7}$$

where $n_i$ is the number of events in bin $i$ and $w_i(k)$ is the weight of the
$k$th event in the $i$th bin. The total number of events in the histogram is
equal to $n = \sum_{i=1}^{m} n_i$, where $m$ is the number of bins. The quantity
$\hat{p}_i = W_i/n$ is the estimator of $p_i$ with expectation value
$E\hat{p}_i = p_i$. Notice that in the case when $g(x) = p(x)$ the weights of
events are equal to 1 and the weighted histogram is the usual histogram. For
weighted histograms the problem of goodness of fit is again to test the
hypothesis

$$H_0: p_1 = p_{10}, \ldots, p_{m-1} = p_{m-1,0} \quad \text{vs.} \quad H_a: p_i \neq p_{i0} \ \text{for some } i, \tag{8}$$

where $p_{i0}$ are specified probabilities, and $\sum_{i=1}^{m} p_{i0} = 1$.
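
The construction of a weighted histogram can be sketched in a few lines of
Python. The example below is only an illustration under assumed inputs: the
target density $p(x) = 2x$ on $[0, 1)$, the sampling density $g(x) = 1$, the bin
edges and the helper name `weighted_histogram` are all hypothetical choices,
made because the exact bin probabilities are then known in closed form.

```python
import random


def weighted_histogram(xs, weight, edges):
    """Return per-bin sums of weights W_i, of squared weights, and event counts n_i."""
    m = len(edges) - 1
    W, W2, counts = [0.0] * m, [0.0] * m, [0] * m
    for x in xs:
        # Find the bin containing x; events outside the edges are skipped.
        for i in range(m):
            if edges[i] <= x < edges[i + 1]:
                w = weight(x)
                W[i] += w
                W2[i] += w * w
                counts[i] += 1
                break
    return W, W2, counts


rng = random.Random(1)
n = 20000
xs = [rng.random() for _ in range(n)]        # events drawn from g(x) = 1 on [0, 1)
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
# Weight function (6): w(x) = p(x) / g(x) = 2x for the assumed densities.
W, W2, counts = weighted_histogram(xs, lambda x: 2.0 * x, edges)
p_hat = [Wi / n for Wi in W]                 # estimators W_i / n of p_i
p_true = [b * b - a * a for a, b in zip(edges, edges[1:])]  # exact integrals of 2x
print(p_hat, p_true)
```

The sums of squared weights are returned as well because the statistics
discussed later in the paper use them.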

In practice the heuristic "chi-square" test statistic is used for this purpose

$$X_h^2 = \sum_{i=1}^{m} \frac{(W_i - np_{i0})^2}{W_{2i}}, \tag{9}$$

where

$$W_{2i} = \sum_{k=1}^{n_i} w_i(k)^2. \tag{10}$$

It is expected that if hypothesis $H_0$ is true then the statistic $X_h^2$ has a
$\chi^2_{m-1}$ distribution. The recommended minimal number of events in a bin
is equal to 25 for application of this test [13-15].
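
A minimal Python sketch of the heuristic statistic follows, assuming it has the
form $\sum_i (W_i - np_{i0})^2 / W_{2i}$ with $W_{2i}$ the sum of squared weights
in bin $i$. The function name and the example values are hypothetical.

```python
def heuristic_chi2(W, W2, p0, n):
    """Heuristic statistic: squared deviations of W_i from n*p_i0, scaled by W_2i."""
    return sum((Wi - n * pi) ** 2 / W2i for Wi, W2i, pi in zip(W, W2, p0))


# With unit weights W_i = W_2i = n_i, so each deviation is divided by the
# observed count n_i instead of the expected count n*p_i0 used in (4).
print(heuristic_chi2([12.0, 8.0], [12.0, 8.0], [0.5, 0.5], 20))
```

Note that for unit weights this expression does not reduce to Pearson's
statistic (4), because the denominator is the observed rather than the expected
bin content.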

The next section of this paper proposes a generalization of the chi-square
test for weighted histograms. A goodness of fit test for weighted histograms
with unknown normalization is proposed in section 3. To evaluate the tests,
in section 4 the sizes and powers of the tests are calculated for numerical
examples with different numbers of events, bins and weight functions. The size
of the test is compared with the calculated size of the heuristic chi-square
test. The comparison demonstrates the superiority of the proposed
generalization of the chi-square test over the heuristic chi-square test.



2 The test 



The total sum of weights of events in the $i$th bin $W_i$, $i = 1, \ldots, m$,
can be considered as a sum of random variables

$$W_i = \sum_{k=1}^{n_i} w_i(k), \tag{11}$$

where the number of events $n_i$ is also a random value and the weights
$w_i(k)$, $k = 1, \ldots, n_i$, are independent random variables with the same
probability distribution function. The distribution of the number of events for
bins of the histogram is the multinomial distribution and the probability of
the random vector $(n_1, \ldots, n_m)$ is

$$P(n_1, \ldots, n_m) = \frac{n!}{n_1!\, n_2! \cdots n_m!}\; g_1^{n_1} \cdots g_m^{n_m}, \quad \sum_{i=1}^{m} g_i = 1, \tag{12}$$

where

$$g_i = \int_{S_i} g(x)\,dx, \quad i = 1, \ldots, m \tag{13}$$

is the probability that a random event belongs to the bin $i$. Integration in
(13) is done over the bin $S_i$.

Let us denote the expectation values of the weights of events from the $i$th bin
as $Ew_i = \mu_i$ and the variances as $\mathrm{Var}\,w_i = \sigma_i^2$. The
expectation value of the total sum of weights $W_i$, $i = 1, \ldots, m$, is [17]:

$$EW_i = E\Bigl[\sum_{k=1}^{n_i} w_i(k)\Bigr] = Ew_i\,En_i = n\mu_i g_i. \tag{14}$$

The diagonal elements of the covariance matrix of the vector
$(W_1, \ldots, W_m)$ are equal to [17]

$$\gamma_{ii} = \sigma_i^2 g_i n + \mu_i^2 g_i (1 - g_i) n = n\alpha_{2i} g_i - n\mu_i^2 g_i^2, \tag{15}$$

where $\alpha_{2i} = Ew_i^2$. The non-diagonal elements $\gamma_{ij}$, $i \neq j$,
are equal to:

$$
\begin{aligned}
\gamma_{ij} &= \sum_{k=0}^{n} \sum_{l=0}^{n} E\Bigl[\sum_{u=1}^{k} \sum_{v=1}^{l} w_i(u)\, w_j(v)\Bigr] h(k, l) - EW_i\, EW_j \\
&= \sum_{k=0}^{n} \sum_{l=0}^{n} E(w_i w_j)\, h(k, l)\, k l - \mu_i n g_i\, \mu_j n g_j \\
&= \mu_i \mu_j (-g_i g_j n + g_i g_j n^2) - \mu_i n g_i\, \mu_j n g_j \\
&= -n \mu_i \mu_j g_i g_j,
\end{aligned}
\tag{16}
$$

where $h(k, l)$ is the probability that $k$ events belong to bin $i$ and $l$
events to bin $j$.

If hypothesis $H_0$ is true then

$$EW_i = n\mu_i g_i = np_{i0}, \quad i = 1, \ldots, m \tag{17}$$

and

$$g_i = p_{i0}/\mu_i, \quad i = 1, \ldots, m. \tag{18}$$

We can substitute $g_i$ into (15), which gives

$$\gamma_{ii} = np_{i0}\Bigl(\frac{1}{r_i} - p_{i0}\Bigr), \tag{19}$$

where $r_i = \mu_i/\alpha_{2i}$ is the ratio of the first moment of the weights
in bin $i$ to the second moment. Substituting into (16) gives

$$\gamma_{ij} = -np_{i0}\,p_{j0}. \tag{20}$$

Notice that for usual histograms the ratio of moments $r_i$ is equal to 1 and
the covariance matrix coincides with the covariance matrix of the multinomial
distribution.
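
The covariance structure given by (19) and (20) can be written down directly.
The Python sketch below builds the full $m \times m$ matrix from $n$, the
hypothesized probabilities and the moment ratios; the function name
`covariance_matrix` and the numbers are hypothetical.

```python
def covariance_matrix(n, p0, r):
    """Covariance of (W_1, ..., W_m) under H0, following (19) and (20)."""
    m = len(p0)
    # Off-diagonal elements (20): -n * p_i0 * p_j0.
    gamma = [[-n * p0[i] * p0[j] for j in range(m)] for i in range(m)]
    for i in range(m):
        # Diagonal elements (19): n * p_i0 * (1 / r_i - p_i0).
        gamma[i][i] = n * p0[i] * (1.0 / r[i] - p0[i])
    return gamma


# For r_i = 1 the result is the multinomial covariance matrix.
gamma = covariance_matrix(100, [0.2, 0.3, 0.5], [1.0, 1.0, 1.0])
print(gamma)
```

For $r_i = 1$ every row sums to zero, which reflects the constraint
$\sum_i n_i = n$ of the multinomial distribution and is the reason one bin is
excluded before the matrix is inverted.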

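Let us now introduce the multivariate Hotelling statistic

$$(\mathbf{W} - n\mathbf{p}_0)^t\, \Gamma_k^{-1}\, (\mathbf{W} - n\mathbf{p}_0), \tag{21}$$

where

$$\mathbf{W} = (W_1, \ldots, W_{k-1}, W_{k+1}, \ldots, W_m)^t, \quad \mathbf{p}_0 = (p_{10}, \ldots, p_{k-1,0}, p_{k+1,0}, \ldots, p_{m0})^t$$

and $\Gamma_k = (\gamma_{ij})_{(m-1) \times (m-1)}$ is the covariance matrix for a
histogram without bin $k$. The matrix $\Gamma_k$ has the form

$$\Gamma_k = \mathrm{diag}\Bigl(n\frac{p_{10}}{r_1}, \ldots, n\frac{p_{k-1,0}}{r_{k-1}}, n\frac{p_{k+1,0}}{r_{k+1}}, \ldots, n\frac{p_{m0}}{r_m}\Bigr) - n\mathbf{p}_0 \mathbf{p}_0^t, \tag{22}$$

and the Woodbury theorem [18] can be applied to find $\Gamma_k^{-1}$. After that
the Hotelling statistic can be written as

$$X_k^2 = \sum_{i \neq k} \frac{r_i (W_i - np_{i0})^2}{np_{i0}} + \frac{\bigl(\sum_{i \neq k} r_i (W_i - np_{i0})\bigr)^2}{n - \sum_{i \neq k} r_i n p_{i0}} \tag{23}$$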



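and can be transformed to

$$X_k^2 = \frac{1}{n} \sum_{i \neq k} \frac{r_i W_i^2}{p_{i0}} + \frac{1}{n}\, \frac{\bigl(n - \sum_{i \neq k} r_i W_i\bigr)^2}{1 - \sum_{i \neq k} r_i p_{i0}} - n, \tag{24}$$

which is convenient for numerical calculations. Asymptotically the vector
$\mathbf{W}$ has a normal distribution with mean $n\mathbf{p}_0$ and covariance
matrix $\Gamma_k$ [19] and therefore the test statistic (23) has a
$\chi^2_{m-1}$ distribution if hypothesis $H_0$ is true. Notice that for usual
histograms, when $r_i = 1$, $i = 1, \ldots, m$, the statistic (23) is Pearson's
chi-square statistic. The expectation value of statistic (23) is equal to

$$EX_k^2 = \frac{1}{n} \sum_{i \neq k} \frac{r_i\, EW_i^2}{p_{i0}} + \frac{1}{n}\, \frac{n^2 - 2n \sum_{i \neq k} r_i\, EW_i + E\bigl(\sum_{i \neq k} r_i W_i\bigr)^2}{1 - \sum_{i \neq k} r_i p_{i0}} - n. \tag{25}$$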

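According to (17) and (19)

$$EW_i^2 = np_{i0}/r_i - np_{i0}^2 + n^2 p_{i0}^2 \tag{26}$$

and

$$E\Bigl(\sum_{i \neq k} r_i W_i\Bigr)^2 = n \sum_{i \neq k} r_i p_{i0} - n \Bigl(\sum_{i \neq k} r_i p_{i0}\Bigr)^2 + n^2 \Bigl(\sum_{i \neq k} r_i p_{i0}\Bigr)^2, \tag{27}$$

then

$$EX_k^2 = m - 1 + (n - 1) \sum_{i \neq k} r_i p_{i0} - n + \frac{n - (2n - 1) \sum_{i \neq k} r_i p_{i0} + (n - 1) \bigl(\sum_{i \neq k} r_i p_{i0}\bigr)^2}{1 - \sum_{i \neq k} r_i p_{i0}} = m - 1, \tag{28}$$

as for Pearson's test [20].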
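
The two algebraic forms of the statistic can be compared numerically. The
Python sketch below implements the form with squared deviations (23) and the
form intended for numerical work (24), with the moment ratios $r_i$ treated as
known. The function names and the example values are hypothetical.

```python
def kept_bins(m, k):
    """Indices of all bins except the excluded bin k."""
    return [i for i in range(m) if i != k]


def x2_form23(W, p0, r, n, k):
    """Statistic in the form (23), with bin k excluded."""
    idx = kept_bins(len(W), k)
    first = sum(r[i] * (W[i] - n * p0[i]) ** 2 / (n * p0[i]) for i in idx)
    lin = sum(r[i] * (W[i] - n * p0[i]) for i in idx)
    return first + lin ** 2 / (n - sum(r[i] * n * p0[i] for i in idx))


def x2_form24(W, p0, r, n, k):
    """Statistic in the form (24), with bin k excluded."""
    idx = kept_bins(len(W), k)
    quad = sum(r[i] * W[i] ** 2 / p0[i] for i in idx) / n
    denom = n * (1.0 - sum(r[i] * p0[i] for i in idx))
    return quad + (n - sum(r[i] * W[i] for i in idx)) ** 2 / denom - n


W, p0, r, n = [30.5, 41.0, 28.0], [0.3, 0.4, 0.3], [0.8, 0.9, 0.7], 100
for k in range(3):
    print(k, x2_form23(W, p0, r, n, k), x2_form24(W, p0, r, n, k))
```

With $r_i = 1$ and integer bin contents both functions return Pearson's
statistic (4) for every choice of the excluded bin.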

Let us now replace $r_i$ with the estimate $\hat{r}_i = W_i/W_{2i}$ and denote
the estimator of the matrix $\Gamma_k$ as $\hat{\Gamma}_k$. Then for positive
definite matrices $\hat{\Gamma}_k$, $k = 1, \ldots, m$, the test statistic is
given as

$$\hat{X}_k^2 = \frac{1}{n} \sum_{i \neq k} \frac{\hat{r}_i W_i^2}{p_{i0}} + \frac{1}{n}\, \frac{\bigl(n - \sum_{i \neq k} \hat{r}_i W_i\bigr)^2}{1 - \sum_{i \neq k} \hat{r}_i p_{i0}} - n. \tag{29}$$

Formula (29) for usual histograms does not depend on the choice of the
excluded bin, but for weighted histograms there can be a dependence. A test
statistic that is invariant to the choice of the excluded bin and at the same
time is Pearson's chi-square statistic for usual histograms can be obtained
as the median value of (29) over the choices of the excluded bin for which the
matrix $\hat{\Gamma}_k$ is positive definite:

$$\hat{X}^2 = \mathrm{Med}\,\{\hat{X}_1^2, \hat{X}_2^2, \ldots, \hat{X}_m^2\}. \tag{30}$$

Usage of $\hat{X}^2$ to test the hypothesis $H_0$ with a given significance
level is equivalent to making a decision by voting.

Use of the chi-square tests is inappropriate if any expected frequency is below
1 or if the expected frequency is less than 5 in more than 20% of bins [21]. This
restriction, known for the usual chi-square test, is quite reasonable for
weighted histograms also and helps to avoid cases when the matrix
$\hat{\Gamma}_k$ is not positive definite.

Notice also that for the case $W_i = 0$ the ratio $\hat{r}_i$ is undefined. The
average value of this quantity over the nearest neighbor bins with non-zero bin
content can be used as an approximation of the undefined $\hat{r}_i$.
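
The median statistic can be sketched in Python as follows. This is an
illustration under stated assumptions, not the author's implementation: all
weights are taken to be positive, every bin is assumed to be non-empty, and
positive definiteness of the estimated matrix is checked through the condition
$1 - \sum_{i \neq k} \hat{r}_i p_{i0} > 0$, which is equivalent for a positive
diagonal matrix minus a rank-one term as in (22). The function name `x2_hat` is
hypothetical.

```python
from statistics import median


def x2_hat(W, W2, p0, n):
    """Median over excluded bins k of statistic (29), using r_i = W_i / W_2i."""
    m = len(W)
    r = [W[i] / W2[i] for i in range(m)]  # estimated moment ratios; needs W2[i] > 0
    values = []
    for k in range(m):
        idx = [i for i in range(m) if i != k]
        denom = 1.0 - sum(r[i] * p0[i] for i in idx)
        if denom <= 0.0:
            continue  # estimated covariance matrix is not positive definite
        quad = sum(r[i] * W[i] ** 2 / p0[i] for i in idx) / n
        rest = (n - sum(r[i] * W[i] for i in idx)) ** 2 / (n * denom)
        values.append(quad + rest - n)
    if not values:
        raise ValueError("no excluded bin gives a positive definite matrix")
    return median(values)


# With unit weights the statistic coincides with Pearson's statistic (4).
print(x2_hat([12.0, 8.0], [12.0, 8.0], [0.5, 0.5], 20))
```

Empty bins would need the neighbor-averaging rule for $\hat{r}_i$ described in
the text, which this sketch leaves out.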
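


3 The test for histograms with unknown normalization



In practice one often faces the case that a histogram is defined up to an
unknown normalization constant $C$. Let us denote a bin content of a histogram
without normalization as $\check{W}_i$, then $W_i = \check{W}_i C$, and the
test statistic (24) can be written as

$$X_k^2 = \frac{C}{n} \sum_{i \neq k} \frac{\check{r}_i \check{W}_i^2}{p_{i0}} + \frac{1}{n}\, \frac{\bigl(n - \sum_{i \neq k} \check{r}_i \check{W}_i\bigr)^2}{1 - C^{-1} \sum_{i \neq k} \check{r}_i p_{i0}} - n, \tag{31}$$

with $\check{r}_i = C r_i$. An estimator for the constant $C$ can be found by
minimization of (31). The normal equation for (31) has the form

$$\frac{1}{n} \sum_{i \neq k} \frac{\check{r}_i \check{W}_i^2}{p_{i0}} - \frac{1}{n}\, \frac{\bigl(n - \sum_{i \neq k} \check{r}_i \check{W}_i\bigr)^2 \sum_{i \neq k} \check{r}_i p_{i0}}{\bigl(C - \sum_{i \neq k} \check{r}_i p_{i0}\bigr)^2} = 0 \tag{32}$$

with two solutions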



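$$\hat{C}_k = \sum_{i \neq k} \check{r}_i p_{i0} \pm \sqrt{\frac{\sum_{i \neq k} \check{r}_i p_{i0}}{\sum_{i \neq k} \check{r}_i \check{W}_i^2 / p_{i0}}}\; \Bigl(n - \sum_{i \neq k} \check{r}_i \check{W}_i\Bigr), \tag{33}$$

where $\hat{C}_k$ is an estimator of $C$. We choose the solution with the
positive sign because it converges to the constant $C = 1$ for the case of a
usual histogram, while the solution with the negative sign does not.
Substituting (33) into (31) we get the test statistic

$${}_cX_k^2 = \frac{\hat{C}_k}{n} \sum_{i \neq k} \frac{\check{r}_i \check{W}_i^2}{p_{i0}} + \frac{1}{n}\, \frac{\bigl(n - \sum_{i \neq k} \check{r}_i \check{W}_i\bigr)^2}{1 - \hat{C}_k^{-1} \sum_{i \neq k} \check{r}_i p_{i0}} - n \tag{34}$$

that has a $\chi^2_{m-2}$ distribution if hypothesis $H_0$ is valid. Formula
(34) can also be transformed to

$${}_cX_k^2 = \frac{s_k^2}{n} + 2 s_k, \tag{35}$$

where

$$s_k = \sqrt{\sum_{i \neq k} \check{r}_i p_{i0}\, \sum_{i \neq k} \check{r}_i \check{W}_i^2 / p_{i0}} - \sum_{i \neq k} \check{r}_i \check{W}_i, \tag{36}$$

which is convenient for calculations.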
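
The relation between the two forms of the statistic for unknown normalization
can be checked numerically. The Python sketch below assumes the estimator of
$C$ with the positive sign and the compact form $s^2/n + 2s$; the function
names, the variable names `Wc` and `rc` for the unnormalized quantities, and
the example values are hypothetical.

```python
import math


def sums(Wc, p0, rc, k):
    """Return the three sums over bins i != k that enter (33)-(36)."""
    idx = [i for i in range(len(Wc)) if i != k]
    B = sum(rc[i] * p0[i] for i in idx)
    a = sum(rc[i] * Wc[i] ** 2 / p0[i] for i in idx)
    A = sum(rc[i] * Wc[i] for i in idx)
    return B, a, A


def c_hat(Wc, p0, rc, n, k):
    """Estimator (33) of the normalization constant, positive-sign solution."""
    B, a, A = sums(Wc, p0, rc, k)
    return B + math.sqrt(B / a) * (n - A)


def x2_long_form(Wc, p0, rc, n, k):
    """Statistic (34): expression (31) evaluated at C = c_hat."""
    B, a, A = sums(Wc, p0, rc, k)
    C = c_hat(Wc, p0, rc, n, k)
    return C * a / n + (n - A) ** 2 / (n * (1.0 - B / C)) - n


def x2_compact_form(Wc, p0, rc, n, k):
    """Statistic (35): s^2 / n + 2 s with s from (36)."""
    B, a, A = sums(Wc, p0, rc, k)
    s = math.sqrt(B * a) - A
    return s * s / n + 2.0 * s


Wc, p0, rc, n = [70.0, 75.0, 55.0], [0.3, 0.4, 0.3], [0.4, 0.45, 0.35], 100
print(x2_long_form(Wc, p0, rc, n, 2), x2_compact_form(Wc, p0, rc, n, 2))
```

Multiplying all weights by a common factor $c$ multiplies each bin content by
$c$ and divides each moment ratio by $c$, which leaves the compact form
unchanged, as expected for a statistic that does not depend on the
normalization.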

The final statistic is obtained by replacing $\check{r}_i$ in (35) with the
estimate $\hat{\check{r}}_i = \check{W}_i/\check{W}_{2i}$, where
$\check{W}_{2i}$ is the sum of squares of the unnormalized weights in bin $i$.
As in section 2, a test statistic that is "invariant" to the choice of the
excluded bin can be obtained as the median value of (35) for all possible
choices of the excluded bin

$${}_c\hat{X}^2 = \mathrm{Med}\,\{{}_c\hat{X}_1^2, {}_c\hat{X}_2^2, \ldots, {}_c\hat{X}_m^2\}. \tag{37}$$

Fig. 1. Probability density functions $g_1(x) = p(x)$, $g_2(x)$ and $g_3(x)$



4 Evaluation of the tests' sizes and power 



The tests described above are now evaluated with a numerical example. We
take a distribution

$$p(x) \propto \frac{2}{(x - 10)^2 + 1} + \frac{1}{(x - 14)^2 + 1} \tag{38}$$

defined on the interval [4, 16] and representing two so-called Breit-Wigner
peaks [22]. Three cases of the probability density function $g(x)$ are
considered (see Fig. 1):



$$g_1(x) = p(x) \tag{39}$$

$$g_2(x) = 1/12 \tag{40}$$



$$g_3(x) \propto \frac{2}{(x - 9)^2 + 1} + \frac{2}{(x - 15)^2 + 1} \tag{41}$$



Distribution (39) gives an unweighted histogram and the method coincides 
with Pearson's chi square test. Distribution (40) is a uniform distribution on 
the interval [4, 16]. Distribution (41) has the same form of parametrization as 
(38), but with different values of the parameters. 
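
For this example the bin probabilities can be obtained in closed form, because
the integral of $1/((x - a)^2 + 1)$ is $\arctan(x - a)$. The Python sketch below
computes $p_{i0}$ for equal-width bins on [4, 16] and the weight function for
the uniform density $g_2(x) = 1/12$. It assumes amplitudes 2 and 1 for the two
peaks of $p(x)$; the function names and the choice of equal-width bins are
hypothetical.

```python
import math

LOW, HIGH = 4.0, 16.0


def peaks_integral(a, b):
    """Integral over [a, b] of 2/((x-10)^2+1) + 1/((x-14)^2+1)."""
    return (2.0 * (math.atan(b - 10.0) - math.atan(a - 10.0))
            + (math.atan(b - 14.0) - math.atan(a - 14.0)))


Z = peaks_integral(LOW, HIGH)  # normalization constant of p(x) on [4, 16]


def p(x):
    """Normalized density p(x) on [4, 16]."""
    return (2.0 / ((x - 10.0) ** 2 + 1.0) + 1.0 / ((x - 14.0) ** 2 + 1.0)) / Z


def bin_probabilities(m):
    """Probabilities p_i0 for m equal-width bins on [4, 16]."""
    width = (HIGH - LOW) / m
    return [peaks_integral(LOW + i * width, LOW + (i + 1) * width) / Z
            for i in range(m)]


def weight_g2(x):
    """Weight w(x) = p(x) / g2(x) for the uniform density g2(x) = 1/12."""
    return 12.0 * p(x)


print(bin_probabilities(5))
```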

Sizes of tests for histograms with different numbers of bins were calculated for
nominal values of size equal to $\alpha = 0.05$ and $\alpha = 0.01$ (Fig. 2).
Calculations of sizes are done using the Monte-Carlo method based on 10000
runs. It can be noticed that the relative deviations of the sizes of the tests
are greater for $\alpha = 0.01$ than for $\alpha = 0.05$. All cases show that
the test sizes are close to their nominal values for a large number $n$ of
events in the histogram, and are reasonably close to the nominal values for low
statistics of events. The same computation was done for the size of the
heuristic test (see Fig. 3). It can be noticed that for a large number $n$ of
events the sizes of the tests tend to the nominal value of the test. For small
numbers $n$ of events in the histograms the sizes of the tests are generally
greater than the nominal values of the tests, although some values of sizes are
not shown in the figures because they are too big. Comparison of the two tests
brings out clearly the superiority of the generalization of Pearson's test over
the heuristic test.

Fig. 2. Sizes of the chi-square tests for histograms with different weight
functions and different numbers of bins as a function of the number of events
$n$ in the histogram. Arrows show regions with an appropriate number of events
in the histogram for test application. (Panels: $\alpha = 0.05$ and
$\alpha = 0.01$; horizontal axis: number of events in histogram.)

Fig. 3. Sizes of the heuristic chi-square test for histograms with different
weight functions and different numbers of bins as a function of the number of
events $n$ in the histogram. Arrows show regions with minimal number of events
in bins of histograms equal to 25. (Panels: $\alpha = 0.05$ and
$\alpha = 0.01$; horizontal axis: number of events in histogram.)

Fig. 4. Sizes of tests for histograms with unknown normalization for different
weight functions and different numbers of bins as a function of the number of
events $n$ in the histogram.

Fig. 5. Probability density function $p(x)$ (solid line) and $p_0(x)$ (dashed
line)

The same study was done for the chi-square test for histograms with unknown
normalization. The results of these calculations are presented in Fig. 4. Again
all cases show that the test sizes are close to nominal values for large
numbers $n$ of events and reasonably close to nominal values for low numbers of
events.
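
A size study of this kind can be mimicked on a much smaller scale. The Python
sketch below is an illustration only, not the computation behind the figures:
it assumes 5 equal-width bins, the uniform density $g_2(x) = 1/12$, amplitudes
2 and 1 for the two peaks of $p(x)$, far fewer runs than the 10000 used in the
paper, and the tabulated 0.95 quantile 9.488 of the chi-square distribution
with 4 degrees of freedom. All function and variable names are hypothetical.

```python
import math
import random
from statistics import median

LOW, HIGH, M = 4.0, 16.0, 5
WIDTH = (HIGH - LOW) / M
CRIT = 9.488  # 0.95 quantile of chi-square with M - 1 = 4 degrees of freedom


def peaks_integral(a, b):
    """Integral over [a, b] of 2/((x-10)^2+1) + 1/((x-14)^2+1)."""
    return (2.0 * (math.atan(b - 10.0) - math.atan(a - 10.0))
            + (math.atan(b - 14.0) - math.atan(a - 14.0)))


Z = peaks_integral(LOW, HIGH)
P0 = [peaks_integral(LOW + i * WIDTH, LOW + (i + 1) * WIDTH) / Z for i in range(M)]


def weight(x):
    """w(x) = p(x) / g2(x) with g2(x) = 1/12."""
    return 12.0 * (2.0 / ((x - 10.0) ** 2 + 1.0) + 1.0 / ((x - 14.0) ** 2 + 1.0)) / Z


def x2_hat(W, W2, p0, n):
    """Median over excluded bins of the statistic with r_i = W_i / W_2i."""
    m = len(W)
    r = [W[i] / W2[i] for i in range(m)]
    values = []
    for k in range(m):
        idx = [i for i in range(m) if i != k]
        denom = 1.0 - sum(r[i] * p0[i] for i in idx)
        if denom <= 0.0:
            continue  # skip excluded bins with a non positive definite estimate
        quad = sum(r[i] * W[i] ** 2 / p0[i] for i in idx) / n
        values.append(quad + (n - sum(r[i] * W[i] for i in idx)) ** 2 / (n * denom) - n)
    return median(values)


def estimate_size(n_events, n_runs, seed):
    """Fraction of simulated histograms, generated under H0, that are rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_runs):
        W, W2 = [0.0] * M, [0.0] * M
        for _ in range(n_events):
            x = LOW + (HIGH - LOW) * rng.random()      # event drawn from g2
            i = min(int((x - LOW) / WIDTH), M - 1)     # bin index, guarded at the edge
            w = weight(x)
            W[i] += w
            W2[i] += w * w
        if x2_hat(W, W2, P0, n_events) > CRIT:
            rejected += 1
    return rejected / n_runs


if __name__ == "__main__":
    print(estimate_size(n_events=500, n_runs=400, seed=7))
```

With so few runs the estimate carries a statistical uncertainty of roughly
0.01, so it can only indicate whether the size is in the neighborhood of the
nominal value 0.05.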

The powers of the new chi-square test and the test with unknown normalization
were investigated for slightly different values of the amplitude of the
second peak of the specified probability distribution function (see Fig. 5):

$$p_0(x) \propto \frac{2}{(x - 10)^2 + 1} + \frac{1.15}{(x - 14)^2 + 1}. \tag{42}$$

The results of these calculations are presented in Fig. 6. Notice that the
powers of the tests with unknown normalization are lower than the powers of the
normalized test. Comparison of the powers of the tests for probability density
functions $g_2(x)$ and $g_3(x)$ with powers of the test for function $g_1(x)$
(usual unweighted histogram) shows that the values of powers are reasonable, as
are the sizes of the new tests.

Fig. 6. Powers of chi-square tests for histograms with different weight
functions and different numbers of bins as a function of the number of events
$n$ in the histogram: (a) chi-square test generalization, (b) chi-square test
for histograms with unknown normalization; $\alpha = 0.05$



5 Conclusions 

A goodness of fit test for weighted histograms is proposed. The test is a
generalization of Pearson's chi-square test. A goodness of fit test for
weighted histograms with unknown normalization is also developed. Both tests
are very important tools in the application of the Monte-Carlo method as well
as in simulation studies of different phenomena. Evaluation of the sizes and
powers of those tests was done numerically for histograms with different
numbers of bins, different numbers of events and different weight functions.
The same investigation was done for the heuristic test often used in practice.
Comparison of the results shows the superiority of the new tests over the
heuristic test.